Abstract. Retrievals of the isotopic composition of water
vapor from the Aura Tropospheric Emission Spectrometer 
(TES) have unique value in constraining moist processes in 
climate models. Accurate comparison between simulated and 
retrieved values requires that model profiles that would be 
poorly retrieved are excluded, and that an instrument op- 
erator be applied to the remaining profiles. Typically, this 
is done by sampling model output at satellite measurement 
points and using the quality flags and averaging kernels from 
individual retrievals at specific places and times. This ap- 
proach is not reliable when the model meteorological con- 
ditions influencing retrieval sensitivity are different from 
those observed by the instrument at short time scales, which 
will be the case for free-running climate simulations. In this
study, we describe an alternative, “categorical” approach to 
applying the instrument operator, implemented within the 
NASA GISS ModelE general circulation model. Retrieval 
quality and averaging kernel structure are predicted empiri- 
cally from model conditions, rather than obtained from collo- 
cated satellite observations. This approach can be used for ar- 
bitrary model configurations, and requires no agreement be- 
tween satellite-retrieved and model meteorology at short time 
scales. To test this approach, nudged simulations were conducted
using both the retrieval-based and categorical operators. Cloud
cover, surface temperature and free-tropospheric moisture content
were the most important predictors of retrieval quality and
averaging kernel structure. There was good agreement between the
δD fields after applying the retrieval-based and more detailed
categorical operators, with increases of up to 30 ‰ over the ocean
and decreases of up to 40 ‰ over land relative to the raw model
fields. The categorical operator performed better over the ocean
than over land, and requires further refinement for use outside of
the tropics. After applying the TES operator, ModelE had δD biases
of −8 ‰ over ocean and −34 ‰ over land compared to TES δD, which
were smaller than the biases using raw model δD fields.

1 Introduction 

In order to usefully compare model predictions against satel- 
lite measurements, various features of the retrieval must be 
taken into account. For retrievals of trace-gas profiles based 
on optimal estimation, these are: the effects of the satellite’s 
orbital path, varying retrieval sensitivity under different
atmospheric conditions, limited vertical resolution, and
contributions from prior constraint profiles. This involves
excluding profiles that would be poorly retrieved, and, for the
profiles remaining, applying an instrument operator to the
raw model profiles. This transforms the raw model fields of
interest into what would be seen by the instrument. By
comparing the modified profiles against the satellite retrievals,
genuine model errors can be more readily identified.

The vertical sensitivity of each retrieval to the true vertical 
profile is represented by an averaging kernel, which depends 
on factors such as cloud cover and surface temperature. In ap- 
plying the instrument operator to the model field, the choice 
of quality filtering, prior and averaging kernels should be as 
specific as possible to the model conditions at each time and 
location. Under the presence of thick clouds, for instance, in- 
frared retrievals are typically of poor quality and excluded 
from any analysis of the satellite data; the same filter needs 
to be applied to the model data in these conditions. This is 
also true for averaging kernel structure. For a high quality 
retrieval over low clouds, the peak retrieval sensitivity will 
be at a greater height than for clear sky conditions, all other 
factors being equal. 

Suitable quality filtering and averaging kernel selection is 
commonly assumed to be achieved by sampling the model 
fields along the orbital path of the satellite and using infor- 
mation from individual retrievals. The assumption underly- 
ing this approach is that the modeled meteorological con- 
ditions influencing retrieval sensitivity and averaging ker- 
nel structure are in good agreement with those viewed by 
the instrument. However, persistent differences between the 
observed and modeled clouds, for example, would lead to 
unsuitable quality filtering, averaging kernel selection, and 
possibly inaccurate diagnostics. When the quality filtering 
and averaging kernel selection are poor, differences between
the satellite and the model for the quantity of interest can- 
not be attributed solely to model error, which is the goal, 
but also to this poor selection, defeating the purpose of ap- 
plying the instrument operator. Selection error will increase 
with fewer constraints on the modeled meteorology. It is pre- 
sumably smaller for chemical transport models (CTMs) with 
fully-prescribed, assimilated meteorology, and increases for 
coupled chemistry-climate models with nudged meteorolog- 
ical components such as horizontal winds. For free-running 
simulations, there is no expectation that the modeled and 
instrument-measured meteorological fields agree at short 
time scales. To the best of our knowledge, however, the effect 
of errors in the meteorology (e.g. clouds) on retrieval quality 
filtering and averaging kernel selection has not been assessed 
in any of these cases. 

Our interest is in retrievals of the deuterium composi- 
tion of water vapor (HDO) from the Tropospheric Emission 
Spectrometer (TES). These data have unique potential value 
in understanding moist processes in the atmosphere (Sher- 
wood et al., 2010), and for our purposes, in constraining 
cloud physics parameterizations. For this purpose, perturbed 
physics tests of convective parameters with nudged winds 
can provide a useful evaluation of the subgrid physics with 
realistic boundary conditions, while free-running simulations 


are important when parameterization changes can feed back
strongly onto the large-scale circulation. But in the latter 
case, because we have no expectation of time-evolving agree- 
ment between the free-running model and observed weather, 
the standard approach to retrieval quality filtering and aver- 
aging kernel selection cannot be used reliably. This is par- 
ticularly important in the case of deuterium because cloud 
processes will strongly influence the isotopic composition of 
vapor, and also its measurability. 

In this study, we examine the assumptions underlying the 
standard, retrieval-based approach to applying the TES HDO 
operator and describe an alternative “categorical” approach 
for use specifically with free-running climate model simu- 
lations. The categorical approach relies as little as possible 
on short time-scale agreement between the model and instru- 
ment of quantities that influence retrieval quality and aver- 
aging kernel structure. It instead uses their dependence on 
atmospheric conditions, similar to those identified by Lee et 
al. (2011), in trying to predict the retrieval quality and av- 
eraging kernel structure for a given set of model conditions. 
Our approach was also motivated by the progress made in 
cloud simulators (e.g. Bodas-Salcedo et al., 2011) in that we 
apply the TES operator as an instrument simulator within the 
NASA GISS ModelE general circulation model (GCM). Our 
focus is on the tropics, in order to evaluate the performance 
of the TES operators under a limited set of conditions, and 
where our future process-based studies will be initially con- 
ducted. 

The paper is structured as follows. Section 2 describes the 
TES HDO retrievals and the factors which influence retrieval 
quality and averaging kernel structure. The GISS ModelE is 
described in Sect. 3. The standard, retrieval-based TES op- 
erator and its suitability are described in Sect. 4. The new, 
categorical TES operator and its suitability are described in 
Sect. 5. In Sect. 6, the effects of applying the two types of 
TES operators on the modeled δD fields are examined, several
sensitivity tests are described, and the retrieved and modeled
δD fields are briefly compared. A brief discussion follows in
Sect. 7. Future studies will examine the reasons for
model-satellite δD discrepancies in detail.


2 TES HDO/H₂O Retrieval

2.1 TES HDO retrieval and instrument operator 

The TES instrument onboard the Aura satellite is an infrared
Fourier transform spectrometer measuring in the 650 cm⁻¹
to 3050 cm⁻¹ spectral range, following a sun-synchronous
orbit with a repeat cycle of 16 days (Beer et al., 2001).
We use version 4 level 2 H₂O and HDO nadir retrievals,
which have a horizontal footprint of 5.3 km by 8.5 km.
H₂O and HDO amounts are jointly retrieved using optimal
estimation, with spectral windows in the region between
1100 cm⁻¹ and 1350 cm⁻¹ (Worden et al., 2006).
The retrieved profiles represent an adjustment from the prior
H₂O and HDO constraint profiles. The adjustment is estimated
iteratively to minimize the difference between the measured
spectra and those predicted by a forward radiative transfer
model using the estimated profiles as input (Clough et al.,
2006). Retrieved profiles are provided on 67 pressure levels.

For HDO, a single, constant HDO/H₂O profile from the
global mean of the NCAR CAM model is used for the prior
constraint. For H₂O, the prior varies by retrieval, and is
obtained from collocated grid points from the GEOS-5 global
transport model operated by the NASA Global Modeling and
Assimilation Office (GMAO) (Rienecker et al., 2007). A single,
fixed H₂O constraint would yield poor-quality retrievals
because H₂O amount can vary so widely in the troposphere.
The retrieval is based on the logarithm of H₂O and HDO
profiles because of their potentially large variation in the
vertical, and to ensure positive retrieved amounts. The estimated
error of the retrieved HDO is 10 % in the tropics (Worden
et al., 2007b). All analysis is for daytime retrievals only, for
compatibility with the simulated ISCCP cloud properties
(described in Sect. 3).

The TES HDO instrument operator applied to model profiles
can be described as follows. Using the notation of Worden
et al. (2011), the model HDO/H₂O ratio x̂_R suitable for
comparison with satellite measurements is expressed as

$$\hat{x}_R = x_R^a + (A_{DD} - A_{HD})(x_D - x_D^a) - (A_{HH} - A_{DH})(x_H - x_H^a) \qquad (1)$$

In Eq. (1), the subscripts and superscripts indicate the
following: “R” relates to the isotopic ratio HDO/H₂O, “a” relates
to a prior constraint, “D” relates to HDO and “H” relates to
H₂O. x_R^a is the prior isotopic ratio HDO/H₂O before
standardization with respect to Vienna Standard Mean
Ocean Water (VSMOW), x_D^a is the prior HDO amount,
x_H^a is the prior H₂O amount, and x_D and x_H are the raw,
modeled HDO and H₂O amounts, respectively. All x terms are the
logarithm of the isotopic ratio or species amount, i.e. x = ln(q),
where q is the species amount in units of volume mixing ratio
(vmr). The x terms are column vectors of size 67 × 1, with
modeled amounts interpolated linearly from the 40 model
levels. A_DD is the HDO averaging kernel, A_HH is the H₂O
averaging kernel, and A_HD and A_DH are the cross-kernels
between them. The cross-kernels represent the sensitivity of one
retrieved species to the actual profile of the other. All
averaging kernels are square but asymmetric matrices of size
67 × 67.
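
The structure of Eq. (1) can be illustrated with a short Python sketch. This is a minimal illustration rather than the ModelE implementation: the function and variable names are hypothetical, profiles are plain lists of log-amounts, kernels are lists of rows, and the two-level values are invented for the example.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matsub(a, b):
    """Element-by-element difference of two equally sized matrices."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def apply_tes_operator(x_r_prior, a_dd, a_hd, a_hh, a_dh,
                       x_d, x_d_prior, x_h, x_h_prior):
    """Return the log HDO/H2O ratio following the form of Eq. (1).

    All x arguments are natural logs of ratios or volume mixing ratios.
    """
    d_adjust = matvec(matsub(a_dd, a_hd),
                      [m - p for m, p in zip(x_d, x_d_prior)])
    h_adjust = matvec(matsub(a_hh, a_dh),
                      [m - p for m, p in zip(x_h, x_h_prior)])
    return [r + d - h for r, d, h in zip(x_r_prior, d_adjust, h_adjust)]


# Hypothetical two-level example (illustrative values only, not TES data).
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
x_h_prior = [math.log(1.0e-2), math.log(2.0e-3)]
x_d_prior = [math.log(2.8e-6), math.log(5.0e-7)]
x_r_prior = [d - h for d, h in zip(x_d_prior, x_h_prior)]
x_h = [math.log(1.2e-2), math.log(1.5e-3)]
x_d = [math.log(3.3e-6), math.log(3.6e-7)]

# With no sensitivity, the result collapses onto the prior ratio.
prior_only = apply_tes_operator(x_r_prior, zeros, zeros, zeros, zeros,
                                x_d, x_d_prior, x_h, x_h_prior)
# With perfect sensitivity and no cross terms, the raw model ratio is recovered.
perfect = apply_tes_operator(x_r_prior, identity, zeros, identity, zeros,
                             x_d, x_d_prior, x_h, x_h_prior)
```

The two limiting cases show why the sensitivity matters: a retrieval with little sensitivity returns mostly the prior, whereas a fully sensitive one returns the model's own ratio.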

Following Risi et al. (2012), the full 67 TES pressure levels
were truncated to the vertical range relevant to HDO analysis.
The x̂_R and x_R^a vectors were truncated to the 10 TES
pressure levels spanning the 909 hPa to 383 hPa range, where
the HDO retrievals are somewhat sensitive. The x_D^a, x_H^a, x_D
and x_H vectors were truncated to the 26 TES levels spanning
the 1000 to 100 hPa range, HDO and H₂O composition
over which can influence the retrievals over 909 hPa to
383 hPa. Accordingly, each of the averaging kernel matrices
is truncated to size 10 × 26. This truncation reduces
computation time and storage requirements for the TES data
considerably, with little effect on the results (Risi et al., 2012).
Most analysis presented in this study is further restricted to
the 825 hPa to 510 hPa range where the HDO retrieval is most
sensitive, following Yoshimura et al. (2011), and which spans
the ~600 hPa level examined by Berkelhammer et al. (2012)
and Risi et al. (2012). TES measurements were mapped to
the 2° × 2.5° ModelE grid.
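
The truncation of an averaging kernel by pressure range can be sketched as follows. The helper names and the five-level pressure grid are hypothetical, and the sketch assumes that kernel rows correspond to retrieved levels and columns to levels of the underlying profile; the default ranges are the ones quoted above.

```python
def level_indices(pressures_hpa, p_max, p_min):
    """Indices of the levels whose pressure lies within [p_min, p_max]."""
    return [i for i, p in enumerate(pressures_hpa) if p_min <= p <= p_max]


def truncate_kernel(kernel, pressures_hpa,
                    row_range=(909.0, 383.0), col_range=(1000.0, 100.0)):
    """Keep rows in row_range and columns in col_range (both in hPa)."""
    rows = level_indices(pressures_hpa, *row_range)
    cols = level_indices(pressures_hpa, *col_range)
    return [[kernel[i][j] for j in cols] for i in rows]


# Hypothetical coarse grid standing in for the 67 TES levels.
pressures = [1000.0, 825.0, 510.0, 300.0, 50.0]
full_kernel = [[10 * i + j for j in range(5)] for i in range(5)]
truncated = truncate_kernel(full_kernel, pressures,
                            row_range=(825.0, 510.0),
                            col_range=(1000.0, 100.0))
```

On the real TES grid the same selection would yield the 10 × 26 matrices described in the text.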

The overall sensitivity of the retrieval is measured by the
trace of the HDO averaging kernel A_DD. HDO retrieval
sensitivity is influenced by cloud thickness and height, surface
temperature and moisture content (Worden et al., 2011).
Only retrievals classified as high quality are included, defined
as having sensitivity greater than 0.5 (Lee et al.,
2011; Berkelhammer et al., 2012; Risi et al., 2012) and the
overall HDO retrieval quality flag set to 1. The minimum
sensitivity requirement ensures that the retrieval is sufficiently
sensitive over some vertical range to the measured spectra,
and not dominated by contributions from the prior constraint.
Figure 1 shows an example TES nadir orbital path during
daytime over the tropics for one day. Of 133 measurements,
only the 85 high-quality retrievals are shown. Example
averaging kernels for one high quality retrieval over the Indian
Ocean are shown in Fig. 2. After the quality filtering, we
adopt the pressure level of peak sensitivity for a given level
of retrieved HDO, defined as p_D, as the key characteristic of
the operator. In Fig. 2a, p_D for both the 619 hPa (purple) and
681 hPa (light blue) levels is approximately 700 hPa. The mean p_D
between 825 hPa and 510 hPa will be the primary metric used
for distinguishing averaging kernel shapes.
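
The quality filter and the p_D metric described above can be sketched as follows. The function names and the four-level kernel are hypothetical, and the sketch assumes that each kernel row holds the sensitivity of one retrieved level to all levels of the true profile.

```python
def retrieval_sensitivity(a_dd):
    """Trace of the (square) HDO averaging kernel."""
    return sum(a_dd[i][i] for i in range(len(a_dd)))


def is_high_quality(a_dd, quality_flag, min_sensitivity=0.5):
    """High quality: flag set to 1 and sensitivity above the threshold."""
    return quality_flag == 1 and retrieval_sensitivity(a_dd) > min_sensitivity


def peak_pressure(kernel_row, pressures_hpa):
    """Pressure at which one row of the averaging kernel peaks."""
    peak = max(range(len(kernel_row)), key=lambda j: kernel_row[j])
    return pressures_hpa[peak]


def mean_p_d(a_dd, pressures_hpa, p_max=825.0, p_min=510.0):
    """Mean peak-sensitivity pressure over retrieved levels in [p_min, p_max]."""
    peaks = [peak_pressure(row, pressures_hpa)
             for row, p in zip(a_dd, pressures_hpa) if p_min <= p <= p_max]
    return sum(peaks) / len(peaks) if peaks else None


# Hypothetical four-level kernel (illustrative values only).
pressures = [900.0, 750.0, 600.0, 400.0]
a_dd = [[0.30, 0.10, 0.00, 0.00],
        [0.20, 0.15, 0.05, 0.00],
        [0.10, 0.20, 0.10, 0.02],
        [0.00, 0.05, 0.10, 0.05]]
p_d = mean_p_d(a_dd, pressures)
```

Here the 750 hPa and 600 hPa rows peak at 900 hPa and 750 hPa respectively, so the mean p_D lies below the levels themselves, as in the Fig. 2a example.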

Figure 3 shows the spatial variation of retrieval quality 
and p_D across the tropics during 2006-2009. There were
202 713 daytime retrievals, 69% of which were high qual- 
ity over the ocean and 57 % over land, but with consider- 
able spatial variation (Fig. 3a). Over the oceans, there were 
fewer high-quality retrievals over the ITCZ and SPCZ bands, 
eastern Indian Ocean, the Maritime Continent, and the West 
Pacific Warm Pool due to the frequent presence of precipi- 
tating clouds. There is also lower retrieval quality off of the 
west coasts of South America and Africa possibly due to low 
moisture content and lower sea-surface temperatures. Over 
land, the lowest quality is over the Sahara, presumably due 
to low moisture content. Given that the retrieval quality can
decrease under either very wet or very dry conditions, there
is apparently no simple rule that would separate low and
high quality retrievals.

Over 825 hPa to 510 hPa, there is also considerable
variation in p_D for high quality retrievals (Fig. 3b). Over the
oceans, p_D is lower (at a higher altitude) in moist regions
where there is abundant mid-tropospheric moisture, but also
in the dry regions off of the coasts of South America and
Africa, presumably due to low-level marine stratocumulus,
as described by Lee et al. (2011). p_D is higher over the dry
subtropical anticyclones due to a moist boundary layer and
dry free troposphere.

Fig. 1. Aura TES nadir orbit for 9 December 2006 during daytime over 15° S to 15° N. Only the 85 high quality HDO retrievals are shown,
of 133 in total.

Fig. 2. Averaging kernels for the retrieval at 10:00 UTC, 9 December
2006 at (3.0° N, 53.75° E).

2.2 Observed controls on TES HDO retrieval quality
and p_D

The first task in developing the new approach is to understand
controls on retrieval quality and p_D in the TES measurements.
Possible controls were identified using the pattern
correlations between Fig. 3a and b and different underlying
meteorological quantities. The following variables were
considered from mean fields calculated from 2006-2009: cloud
optical depth (τ), cloud fraction (CF), defined as the percentage
of retrievals in a grid cell with cloud optical depths
greater than 0.3, cloud top pressure (CTP), surface temperature
(T_S), and moisture content. Moisture content was expressed
as total precipitable water (PW_T) and further separated
into precipitable water in the boundary layer (PW_B)
(within 150 hPa of the surface) and precipitable water in the
free atmosphere (PW_F) (more than 150 hPa above the surface). All
moisture quantities were computed from the prior H₂O profiles,
which are sampled from GMAO reanalysis. The analysis
of controls on p_D is for high quality retrievals only, for
both the averaging kernels and underlying meteorological
quantities. Correlation and regression quantities were computed
using ordinary least-squares regression, which does not
take into account errors in the control variables.
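
A pattern correlation and ordinary least-squares fit between two gridded mean fields can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, grid cells are unweighted (no area weighting), missing cells are marked with None, and the 2 × 2 fields are invented values.

```python
import math


def valid_pairs(field_x, field_y):
    """Pair up grid cells where both fields are defined."""
    return [(x, y)
            for row_x, row_y in zip(field_x, field_y)
            for x, y in zip(row_x, row_y)
            if x is not None and y is not None]


def moments(pairs):
    """Means, variances (sums of squares) and covariance of the pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    ss_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    ss_y = sum((y - mean_y) ** 2 for _, y in pairs)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return mean_x, mean_y, ss_x, ss_y, ss_xy


def pattern_correlation(field_x, field_y):
    """Pearson correlation across all jointly valid grid cells."""
    _, _, ss_x, ss_y, ss_xy = moments(valid_pairs(field_x, field_y))
    return ss_xy / math.sqrt(ss_x * ss_y)


def ols_fit(field_x, field_y):
    """Slope and intercept of y regressed on x (errors in x ignored)."""
    mean_x, mean_y, ss_x, _, ss_xy = moments(valid_pairs(field_x, field_y))
    slope = ss_xy / ss_x
    return slope, mean_y - slope * mean_x


# Hypothetical fields: cloud fraction (%) and retrieval quality (%).
cloud_fraction = [[10.0, 30.0], [50.0, None]]
retrieval_quality = [[90.0, 70.0], [50.0, 60.0]]
r = pattern_correlation(cloud_fraction, retrieval_quality)
slope, intercept = ols_fit(cloud_fraction, retrieval_quality)
```

The cell with a missing cloud fraction is dropped from both fields before the statistics are formed, so the two fields always contribute the same set of cells.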

Table 1 lists the pattern correlations. Over the ocean,
retrieval quality was most strongly associated with CF, with a
correlation of −0.70, indicating that, as would be expected,
retrieval quality decreases with increasing cloud cover.
Compared to CF, τ was a weak predictor of retrieval quality, likely
because of its highly non-normal distribution. Over land,
retrieval quality was most strongly associated with T_S, with
a correlation of −0.72, and to a slightly lesser degree with
PW_B (which itself has a correlation of −0.59 with T_S). Over
the ocean, p_D is most strongly associated with PW_F. As PW_F
decreases, p_D moves toward the boundary layer, where moisture
is abundant and which will therefore exert a stronger influence
on the retrieved HDO at higher altitudes. PW_B itself had a
low correlation with p_D because it varies substantially less
than PW_F over the ocean. Over land, p_D was most strongly
associated with T_S, but with a weaker correlation of 0.51
(Table 1) compared to over ocean, and an equally strong
correlation with PW_F.

The linear fits of retrieval quality and p_D against the primary
control variables are shown in Fig. 4. It can also be seen
that the observed control on retrieval quality over land is due 
to a set of high-temperature, low quality points, which were 
associated with extremely hot and dry conditions over the 
Sahara (Fig. 3a). The unexplained variation in these relation- 
ships is due to the influence of the more weakly correlated 
variables and other unknown factors. We considered adopt- 
ing multivariate regression models to capture this variability, 
but found that the collinearity between meteorological quan- 
tities led to unstable regression estimates, and that remedial 
measures such as principal component regression precluded 
straightforward interpretation. 

Comparisons such as those in Fig. 4 will serve as the pri- 
mary means of evaluating the suitability of different TES op- 
erators. It is these relationships that we seek to evaluate for 
different TES HDO operators in the model, namely that: 

- Retrieval quality should decrease where there is increasing
CF over ocean and increasing T_S over land.

[Figure 3 panels: (a) TES retrieval quality (%), ocean mean 69.0, land mean 57.1; (b) TES p_D (hPa), ocean mean 580.9, land mean 587.8.]

Fig. 3. (a) TES HDO retrieval quality and (b) mean p_D (height of peak HDO sensitivity) over 825 to 510 hPa for high quality retrievals only.
Both fields are the mean across all retrievals from 2006-2009.


Table 1. Pattern correlation between TES HDO retrieval quality (Fig. 3a) and p_D (height of peak HDO averaging kernel sensitivity) (Fig. 3b)
and candidate variables. Cloud fraction is the frequency of occurrence within a grid cell of observations with τ greater than 0.3. For p_D,
correlations are for high-quality observations only. The strongest correlation in each column is marked with an asterisk.

                                                  Retrieval quality     Pressure of peak HDO
                                                                        sensitivity (p_D)
Description                           Variable    Ocean      Land       Ocean      Land
Cloud optical depth                   τ           −0.38       0.30      −0.39      −0.35
Cloud fraction (%)                    CF          −0.70*      0.39      −0.55      −0.28
Cloud top pressure (hPa)              CTP          0.33       0.15       0.13       0.00
Precip. water in bdy. layer (mm)      PW_B        −0.15       0.57      −0.29      −0.39
Precip. water in free atm. (mm)       PW_F        −0.43       0.32      −0.70*     −0.50
Precip. water total (mm)              PW_T        −0.35       0.44      −0.58      −0.48
Surface temperature (K)               T_S         −0.28      −0.72*      0.04       0.51*


- p_D should move closer to the boundary layer as PW_F
decreases over the ocean, and move closer to the free
troposphere as T_S decreases over land.

- The scatter in the linear fits should be similar to that observed
in the TES measurements. That is, the dispersion of the
residuals around the fitted regression lines should be
similar to that in Fig. 4.


3 NASA GISS ModelE 

We use the atmosphere-only version of the NASA GISS
ModelE general circulation model at 2° × 2.5° horizontal
resolution and 40 vertical levels. The core model is an updated
version of that described in Schmidt et al. (2006), with
a recent summary of the cloud physics provided by Kim et
al. (2011). The simulation period was 2006-2009, covering
the continuous period of TES retrievals, with an additional
year for spin-up. A spin-up time of five years did not affect
the results. Interannually varying monthly sea-surface
temperatures and sea-ice cover are prescribed (Rayner et
al., 2003). The horizontal winds in the model were nudged
toward NCEP-NCAR Reanalysis (Kalnay et al., 1996) at
each model time step. All other dynamical quantities are
calculated prognostically. Our eventual interest is evaluation
of free-running simulations against the TES observations,
but nudging allowed for consistent comparison between the
retrieval-based and categorical TES operators for a configuration
typical of how the retrieval-based operator has been
commonly applied in the past.

ModelE is equipped with stable water isotope tracers 
(Schmidt et al., 2005), advected using the quadratic upstream 
scheme of Prather (1986), which yields an effective transport 
resolution approximately twice that of the horizontal model 
resolution. Isotopic fractionation between H₂O and the rare
isotopologues H₂¹⁸O and HDO is parameterized for all moist
processes, from evaporation and evapotranspiration over the 
ocean and land surfaces, to condensation and deposition, and 
post-condensation exchange between rainfall and vapor. The 
stable water isotope tracer parameterization is much sim- 
pler than the underlying cloud parameterization, and is more 
tightly constrained by laboratory measurements. This is what 
makes the TES HDO retrievals potentially valuable, as iso- 
topic measurements can be used in evaluating the underlying 
cloud physics with a fair amount of confidence that the iso- 
topic physics are correct. Or, put another way, errors in the 
modeled isotopic fields are likely to be dominated by errors 
in the cloud physics rather than errors in the isotopic physics. 

ModelE also includes an internal simulator for the International
Satellite Cloud Climatology Project (ISCCP) that
produces cloud diagnostics for comparison with the ISCCP
datasets (Klein and Jakob, 1999). For our purposes, the key
feature of the ISCCP simulator is the random, subgrid joint
distribution of τ and CTP, conditioned upon the grid-scale
vertical distributions of humidity, convective cloud cover and
large-scale cloud cover.

Fig. 4. TES retrieval quality (left) and p_D (height of peak HDO
sensitivity, right) as a function of primary control variables identified
in Table 1 over ocean (top) and land (bottom). Dashed lines show
the 95 % prediction intervals.


4 Retrieval-based TES HDO operator 

4.1 Review of retrieval-based operators in previous 
studies 

In applying the TES operator in Eq. (1) to model profiles
x_D and x_H, choices must be made on whether to include the
profile, and on the prior profile x_H^a and the averaging
kernels A_DD, A_HH, A_HD and A_DH, all of which are different
for each retrieval. These choices should reflect the conditions
at each model point. Using the standard, retrieval-based
approach, the model fields are sampled along the orbital path,
excluding model points collocated with poor-quality
retrievals. For the remaining model points, the averaging
kernels and priors from individual measurements are used in
applying Eq. (1). The underlying assumption of this approach
is that the modeled and retrieved factors influencing retrieval
quality and averaging kernel structure are in agreement.

This approach is based on the earlier, pre-Aura launch
description by Jones et al. (2003) of the potential accuracy of the
TES CO retrievals. Variants of the technique have been used
in validating TES retrievals against collocated measurements
of CO from aircraft (Luo et al., 2007a), O₃ from aircraft
(Richards et al., 2008) and sondes (Worden et al., 2007a),
and H₂O measurements from sondes (Shephard et al., 2008).
It has also been used for comparisons between TES CO
retrievals and those from the Atmospheric Chemistry
Experiment (ACE) (Rinsland et al., 2008) and Measurements of
Pollution in the Troposphere (MOPITT) (Luo et al., 2007b).

The approach has subsequently been applied in CTM studies
focusing on TES O₃ data assimilation (Parrington et al.,
2008), the sources, sinks and transport of pollution in the
troposphere (Nassar et al., 2009; Choi et al., 2010; Liu et al.,
2009), and inverse modeling of CO (Jones et al., 2009) and
CO₂ (Nassar et al., 2011). These studies all involved CTMs
with fully prescribed meteorological fields. In studies us- 
ing the GEOS-Chem CTM, meteorology is prescribed from 
GMAO reanalysis. Through its assimilation of radiosonde 
profiles of temperature, humidity and winds, and indepen- 
dent satellite estimates of atmospheric moisture and winds, 
the GMAO reanalysis provides reasonable estimates of the 
factors which are known to influence averaging kernel struc- 
ture and retrieval sensitivity (e.g. Norris and Da Silva, 2007). 

Voulgarakis et al. (2011) applied the TES operator using
the retrieval-based approach in their analysis of O₃-CO
correlations for two coupled chemistry-climate models with
prescribed SSTs and horizontal winds nudged toward reanalyses.
All other meteorological fields were calculated prognostically,
unlike the CTM studies described above. Aghedo et
al. (2011) considered three chemistry-climate models with
prescribed SSTs and nudging toward reanalysis, using
collocation-based averaging kernel selection and quality
filtering. A fourth free-running (non-nudged) simulation was
also considered. Their focus was on estimating the error
associated with using monthly mean maps of spatially varying
averaging kernels rather than individual retrievals. A small
error would allow the TES operator to be applied to monthly
mean model output, simplifying multi-model comparisons
against satellite measurements. We note that by embedding
the TES operator within the model, we have avoided this
issue altogether.

Risi et al. (2012) used the monthly-mean approach in
comparing TES HDO fields to those from nudged simulations
with the LMDz isotopically-equipped GCM for several
different parameter values within the cloud scheme. Yoshimura
et al. (2011) used the standard approach using individual
retrieval-based sampling in their comparison of the TES and
IsoGSM HDO fields with varying isotopic physics, noting
that this approach necessitates model nudging. Both studies
stressed the importance of applying the TES operator to
model outputs for quantitative comparisons with the data.
Lee et al. (2009) and Field et al. (2010) compared TES HDO
to free-running simulations with different convective and
isotopic configurations, but without applying a TES operator,
making their interpretation necessarily qualitative.

4.2 Retrieval-based controls on TES HDO retrieval
quality and p_D

The standard, retrieval-based TES operator was implemented
within ModelE for H₂O and HDO. TES retrievals are
ingested into the model’s TES simulator along Aura’s orbital
path (as in Fig. 1) at each half-hour model time step, during
daytime and over the tropics only. The retrieval quality
filtering and averaging kernel selection is done regardless of the
agreement in meteorology between the model and TES. In
cases where a model cell contains more than one high
quality TES measurement, the averaging kernels and H₂O priors
for all are applied to the model profile and the mean of the
resulting profiles is taken.
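
The collocation step can be sketched as follows: high quality retrievals are binned by model time step and grid cell, and profiles that share a cell are averaged. The function names, the dictionary keys and the grid layout (cells aligned to 90° S and 180° W) are assumptions made for the illustration and need not match the ModelE grid.

```python
from collections import defaultdict

LAT_STEP, LON_STEP = 2.0, 2.5   # grid spacing in degrees
STEP_MINUTES = 30               # half-hour model time step


def grid_cell(lat, lon):
    """Cell indices, assuming edges start at 90 S and 180 W (hypothetical)."""
    i = min(int((lat + 90.0) // LAT_STEP), int(180.0 / LAT_STEP) - 1)
    j = int(((lon + 180.0) % 360.0) // LON_STEP)
    return i, j


def time_step(hour_utc, minute_utc):
    """Index of the half-hour step within the day."""
    return (hour_utc * 60 + minute_utc) // STEP_MINUTES


def group_retrievals(retrievals):
    """Group high quality retrievals by (time step, lat index, lon index)."""
    groups = defaultdict(list)
    for r in retrievals:
        if r["high_quality"]:
            key = (time_step(r["hour"], r["minute"]),) + grid_cell(r["lat"], r["lon"])
            groups[key].append(r)
    return dict(groups)


def mean_profile(profiles):
    """Level-by-level mean of several profiles from one grid cell."""
    return [sum(level) / len(profiles) for level in zip(*profiles)]


# Hypothetical retrievals: two high quality ones share a cell and time step.
retrievals = [
    {"lat": 3.0, "lon": 53.75, "hour": 10, "minute": 0, "high_quality": True},
    {"lat": 3.5, "lon": 54.0, "hour": 10, "minute": 10, "high_quality": True},
    {"lat": 3.2, "lon": 53.9, "hour": 10, "minute": 5, "high_quality": False},
]
groups = group_retrievals(retrievals)
averaged = mean_profile([[1.0, 2.0], [3.0, 4.0]])
```

In a full implementation, each retrieval in a group would supply its own averaging kernels and H₂O prior to the operator before the resulting profiles are averaged with `mean_profile`.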

We evaluated the suitability of this approach by comparing
the relationships in Table 1 for the TES observations to those
from the retrieval-based operator. If the modeled meteorology
agreed exactly with that retrieved by the instrument, then
the relationships between retrieval quality and p_D would be
the same as in Table 1 when the control variables from TES
are replaced with those from the model. The degree to which
this is not the case quantifies the difference in meteorology
observed by TES and simulated by the model in the context
of their influence on retrieval quality and p_D.

Figure 5 shows the same observed TES retrieval quality
and p_D as Fig. 4, but as a function of modeled CF and PW_F
over the ocean and T_S over land. Over the ocean, there is too
weak a decrease in observed retrieval quality with increasing
model CF, indicated by the slope of −0.18 and weaker
correlation of −0.26 (Fig. 5a). This reflects, despite nudging,
the low correlation of 0.35 between the TES and ModelE
CF. The regions where TES is excluding more retrievals do
not always correspond to where the thick clouds are in the
model, for example. There is less disagreement in control on
retrieval quality over land (Fig. 5c), because of the higher
correlation of 0.74 between modeled and retrieved T_S.
Compared to retrieval quality, the observed controls on p_D over
the ocean are better captured by the retrieval-based operator
(Fig. 5b). This is also due to the strong correlation between
the TES and ModelE ocean PW_F fields (0.86), which leads
to a similar relationship with p_D. Over land, the relationship
between modeled T_S and p_D is in fair agreement with, but
slightly weaker than, that for the observed T_S.


5 Categorical TES HDO operator 

5.1 Description of categorical operator 

The observed TES retrieval quality and, to a lesser extent, p_D
are not entirely consistent with the underlying model
conditions, despite nudging. This problem will be worse for
free-running simulations. We have therefore developed a
technique to apply the TES operator in a way that presumes no
agreement between the observed and modeled meteorology
at short time-scales, but such that the retrieval quality and
averaging kernel selection are suited to the modeled conditions
at that point. This approach is referred to as the “categorical
operator” and was implemented alongside the retrieval-based
operator in ModelE.

Fig. 5. Same as Fig. 4, but with control variables from TES replaced
with those from ModelE. Black dashed lines show the corresponding
linear fits from TES observations in Fig. 4.

For different categories defined according to the variables
in Table 1, we computed the mean retrieval quality, mean
averaging kernels, and mean H₂O prior from the TES retrievals
(described in detail in the next section). The mean retrieval
quality is the proportion of HDO retrievals in a category that
were classified as high quality. The mean of the averaging
kernels is the matrix resulting from taking the
element-by-element means of all averaging kernels (for high quality
retrievals only) falling into a given category. Applying the
categorical TES operator in the model then consists of two steps:

1. At each time step and grid point, the values of the categorical
variables in the model are used to look up the associated
categorical TES retrieval quality. The model profile is included
with a probability equal to the categorical retrieval quality. If a
particular set of model conditions was associated with 30 % high
quality retrievals, for example, then there is a 30 % chance that
the model profile would be included.

2. For the profiles passing the retrieval quality filter, the
categorical variable values in the model are used to look up the
associated prior H2O profile and averaging kernels, which are used
in applying Eq. (1).

Thus, rather than use information from individual retrievals, we use
conditions in the model to empirically predict the retrieval quality
and averaging kernel structure for a sampled model point.
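
The two steps above can be sketched in Python as follows. This is a minimal illustration and not the ModelE implementation: the category names, the lookup tables and the function name sample_profile are hypothetical, and the quality values are invented for the example. The kernel and prior are represented by placeholder strings; in practice they would be the category-mean averaging kernels and H2O prior profile passed to Eq. (1).

```python
import random

# Hypothetical lookup tables keyed by category. The numbers are
# illustrative only and are not taken from the TES record.
QUALITY = {"clear_dry": 0.80, "thick_high_cloud": 0.30}
KERNEL_AND_PRIOR = {
    "clear_dry": ("mean_kernel_clear_dry", "mean_h2o_prior_clear_dry"),
    "thick_high_cloud": ("mean_kernel_thick", "mean_h2o_prior_thick"),
}


def sample_profile(category, rng):
    """Return (kernel, prior) if the profile passes the filter, else None."""
    # Step 1: include the profile with probability equal to the
    # categorical retrieval quality.
    if rng.random() >= QUALITY[category]:
        return None
    # Step 2: look up the category-mean averaging kernel and H2O prior.
    return KERNEL_AND_PRIOR[category]


rng = random.Random(0)
n = 10000
kept = sum(sample_profile("thick_high_cloud", rng) is not None for _ in range(n))
print(kept / n)  # fraction of profiles retained, expected near 0.30
```

Drawing a uniform random number per profile means that, over many time steps, the fraction of model profiles retained in a category converges to that category's observed proportion of high quality retrievals.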

5.2 TES categorizations 

Thirteen categorizations of increasing complexity were considered,
ranging from a single category across all retrievals to 1620
categories when the retrievals were separated according to discrete
ranges of all control variables. Table 2 shows the values used for
each variable in the different categorizations. CF is not retrieved
for individual measurements, but is included implicitly in the
categorizations involving clouds through a clear-sky category with τ
less than 0.3. An important element of the categorical operator is
our use of the ISCCP simulator in ModelE. Rather than use grid-mean
values of τ and CTP, we randomly select an ISCCP subgrid column with
equal probability and use its τ and CTP. The subgrid τ will not be
normally distributed; a single, large τ can skew an otherwise
clear-sky grid box toward an unrepresentatively high τ in the
grid-scale mean. Using the individual ISCCP subgrid columns guards
against the bias toward high τ values with low retrieval sensitivity
that would inevitably result if the grid-scale mean were used.
Inclusion of low sensitivity retrievals would result in a comparison
of retrieved profiles and, after applying the TES operator, model
profiles that have both relaxed toward the prior, creating
artificially high agreement between the satellite and model (Nassar
et al., 2008).
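
The effect of using a randomly selected subgrid column rather than the grid-box mean can be illustrated with a small sketch. The optical depths below are hypothetical (nine clear-sky columns and one optically thick column); only the 0.3 clear-sky threshold comes from the text.

```python
import random
from statistics import mean

# Hypothetical optical depths for the subgrid columns of one grid box.
subgrid_tau = [0.1] * 9 + [40.0]

# The grid-box mean is dominated by the single thick column.
grid_mean_tau = mean(subgrid_tau)

# Selecting one subgrid column with equal probability, repeated many times.
rng = random.Random(1)
draws = [rng.choice(subgrid_tau) for _ in range(10000)]
clear_fraction = sum(tau < 0.3 for tau in draws) / len(draws)

print(grid_mean_tau)    # well above the clear-sky threshold of 0.3
print(clear_fraction)   # close to the 90 % of columns that are clear
```

With the grid mean, this box would always be assigned to a cloudy category; with random column selection it is treated as clear sky in roughly nine cases out of ten, which is closer to what a narrow-footprint instrument would encounter.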

Categorizations are named according to the variables they include.
We tried to strike a balance between capturing distinctions in
retrieval quality and averaging kernel structure and using as few
categories as possible. The cloud-only C categorization extends the
decomposition of Lee et al. (2011) to the coarse, qualitative ISCCP
categories. The C_fine categorization corresponds to the full ISCCP
categories. The PW and PW_fine categorizations use precipitable
water only, and contribute 9 and 49 categories, respectively, when
precipitable water is separated into boundary layer and
free-atmosphere components. The LOτTPW_F categorization, with 180
categories, included only the variables identified in Table 1 as the
most important (land/ocean separation, τ, T_S and PW_F). This was a
possible optimal categorization that captures variation in retrieval
quality and p_D using far fewer categories than the full LOCTPW
categorization, which includes all variables.
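
The category counts quoted above follow from the bin edges in Table 2. The sketch below transcribes those edges and counts the categories obtained by crossing the variables. The dictionary layout, the variable keys and the function names are assumptions made for illustration, as is the choice to assign a value lying exactly on an edge to the higher bin, which the text does not specify.

```python
from bisect import bisect_right
from math import prod

# Interior bin edges from Table 2; the lowest bin starts at the first
# listed value (or is open below for T_S) and the highest bin is open above.
EDGES = {
    "tau": [0.3, 1.3, 3.6, 23],               # 5 bins
    "ctp": [440, 680],                        # 3 bins (hPa)
    "ts": [295, 300, 305, 310, 315],          # 6 bins (K)
    "pw_b": [10, 20],                         # 3 bins (mm)
    "pw_f": [10, 20],                         # 3 bins (mm)
    "pw_b_fine": [5, 10, 15, 20, 25, 30],     # 7 bins (mm)
    "pw_f_fine": [5, 10, 15, 20, 25, 30],     # 7 bins (mm)
}


def bin_index(variable, value):
    """Index of the bin containing value (0 is the lowest bin)."""
    return bisect_right(EDGES[variable], value)


def n_categories(variables, land_ocean=False):
    """Number of categories formed by crossing the bins of the variables."""
    n = prod(len(EDGES[v]) + 1 for v in variables)
    return 2 * n if land_ocean else n


print(n_categories(["pw_b", "pw_f"]))                        # PW
print(n_categories(["pw_b_fine", "pw_f_fine"]))              # PW_fine
print(n_categories(["tau", "ts", "pw_f"], land_ocean=True))  # LOτTPW_F
print(n_categories(["tau", "ctp", "ts", "pw_b", "pw_f"], land_ocean=True))  # LOCTPW
```

The four printed counts are 9, 49, 180 and 1620, matching the values given in the text for the PW, PW_fine, LOτTPW_F and full LOCTPW categorizations.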

Table 2. Category values for different parameters.

Identifier  Description       Category ranges
LO          Land/ocean
C           τ                 0, 0.3, 1.3, 3.6, 23, > 23
            CTP (hPa)         0, 440, 680, > 680
C_fine      τ                 0, 0.3, 1.3, 3.6, 9.4, 23, 60, > 60
            CTP (hPa)         0, 180, 310, 440, 560, 680, 800, > 800
T           T_S (K)           < 295, 295, 300, 305, 310, 315, > 315
PW          PW_B, PW_F (mm)   0, 10, 20, > 20
PW_fine     PW_B, PW_F (mm)   0, 5, 10, 15, 20, 25, 30, > 30

To show how retrieval quality and averaging kernel structure vary,
we look first at the C categorization based on τ and CTP. Retrievals
with τ less than 0.3 account for 64 % of observations, with the rest
consisting mostly of mid- and high-level clouds (Table 3). Retrieval
quality is generally high for τ less than 1.3, and for low-level
clouds with τ between 1.3 and 3.6 (Table 4), but otherwise poor. The
relatively poor quality of 68.2 % for the low τ and high CTP
category suggests an additional factor influencing retrieval
quality, such as T_S over land.

To illustrate the associated changes in averaging kernel structure,
Fig. 6 shows the averaging kernel rows at 619 hPa for CTP less than
440 hPa and three different ranges of τ. Averaging kernel rows for τ
less than 0.3 (Fig. 6a) have a higher p_D than for τ between 0.3 and
1.3 (Fig. 6b), but neither peak is particularly sharp. Neither is
significantly different from the grand mean because these categories
constitute such a large proportion of all retrievals. Sensitivity
for thicker clouds is generally low (Fig. 6c), even with only high
quality retrievals included, and the averaging kernel has a much
flatter peak. The average retrieval quality for this category is
11 %. Model points corresponding to these conditions would in
general be excluded from the analysis.

The CPW categorization extends the C categorization by further
separating the retrievals according to PW_B and PW_F, which may vary
independently of cloud cover. Figure 7 shows the averaging kernels
underlying the mean in Fig. 6a, but for a moist boundary layer (PW_B
greater than 20 mm) and for three categories of PW_F. The main
distinction is that p_D increases from 600 hPa in Fig. 7a to 800 hPa
in Fig. 7c as PW_F decreases. The error bars are also narrower than
in Fig. 6a and, particularly for the low PW_F case, the peaks are
sharper than when separating based on τ only in Fig. 6a and Fig. 6b.
Although the focus of the averaging kernel separation is the A_DD
row at 619 hPa, the corresponding changes in the H2O prior (not
shown) were as expected, with the prior decreasing strongly above
the boundary layer for PW_F less than 10 mm.

Before applying the TES operator, we can gauge how more complicated
categorizations might yield a better mapping from model conditions
to retrieval quality and the most suitable averaging kernels. Of
interest is the degree to which different categorizations separate
high from poor quality retrievals, and for the high quality
retrievals, the degree to which p_D is separated. This is analogous
to the correlations in Table 1, but for a set of discretized
predictor variables.

Table 3. Frequency of occurrence (%) for TES retrievals for the C
categorization. There was a total of 20713 retrievals during daytime
over the tropics.

                           Cloud optical depth
Cloud top pressure (hPa)   0-0.3   0.3-1.3   1.3-3.6   3.6-23   > 23
0-440                      43.3    7.3       4.1       5.8      0
440-680                    17.2    4.2       5.6       1.6      0
680-1000                   3.5     2         2.5       2        0.8

Table 4. Percentage of TES retrievals that were high quality for the
C categorization. Overall, 69 % of retrievals were high quality.

                           Cloud optical depth
Cloud top pressure (hPa)   0-0.3   0.3-1.3   1.3-3.6   3.6-23   > 23
0-440                      79      86.1      11.6      0
440-680                    78.8    85.5      39        10.4
680-1000                   68.2    81        83.1      64.4     26.8

For each categorization, the separation between high and poor
quality retrievals was measured by the mean difference between each
category’s quality and the overall mean quality. In computing the
mean difference, each categorical quality is weighted by the number
of observations, so that low-quality categories with few
observations are not over-represented. For the C categorization,
this value is 18.4 %: the mean of the absolute differences between
the entries in Table 4 and the overall mean of 68 %, with the
absolute difference in each category weighted by the frequency of
occurrence entries in Table 3. Figure 8 shows this value for each of
the twelve categorizations. Most of the separation in retrieval
quality can be obtained using only the simple “C” categorization,
with smaller contributions from other variables. This is consistent
with the strong pattern correlation between retrieval quality and
cloud fraction in Table 1. The strongest additional gains are made
by including T_S in the categorization (CT), consistent with its
association with retrieval quality over land. Despite the importance
of cloud properties in separating good retrievals from bad, little
was gained by using the “C_fine” categorization, which is likely due
to the larger error in the cloud properties (Eldering et al., 2008)
compared to other categorical variables.
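
The weighted mean absolute difference for the C categorization can be checked directly from Tables 3 and 4. The sketch below uses the tabulated values as printed; cells with zero frequency have no quality entry and are left out. The variable names are illustrative.

```python
# Frequency of occurrence (%) from Table 3 and percentage of high quality
# retrievals from Table 4, ordered by CTP row (0-440, 440-680, 680-1000 hPa)
# and, within each row, by increasing cloud optical depth.
freq = [43.3, 7.3, 4.1, 5.8,
        17.2, 4.2, 5.6, 1.6,
        3.5, 2.0, 2.5, 2.0, 0.8]
qual = [79.0, 86.1, 11.6, 0.0,
        78.8, 85.5, 39.0, 10.4,
        68.2, 81.0, 83.1, 64.4, 26.8]
overall = 68.0  # overall mean quality (%) used in the text

# Frequency-weighted mean of the absolute differences from the overall mean.
separation = sum(f * abs(q - overall) for f, q in zip(freq, qual)) / sum(freq)
print(round(separation, 1))
```

With the rounded table entries this evaluates to about 18.5 %, close to the 18.4 % reported; the small difference presumably reflects rounding in the tabulated values.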

Averaging kernel separation was measured by the total
root-mean-square error (RMSE) of p_D at 619 hPa across all
categories in a categorization. Only high quality retrievals were
considered in calculating the p_D RMSE, for consistency with any
analysis of the retrieved HDO fields. The p_D RMSE can be thought of
as the total, within-category standard deviation of p_D across all
categories, weighted by frequency of occurrence. We are interested
in the degree to which the total within-category variance of p_D
decreases for increasingly complicated categorizations, or how the
error bar widths tend to decrease across all categories within a
categorization. A decrease in the p_D RMSE would result in a better
mapping between model conditions and averaging kernel shape.
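
One plausible reading of this measure is a pooled within-category RMSE, in which each retrieval's p_D is compared with the mean of its own category. The sketch below follows that reading; the function name and the p_D values are hypothetical, and the exact formula used for Fig. 9 is not given in the text.

```python
from math import sqrt


def pooled_rmse(groups):
    """RMS deviation of each value from the mean of its own category.

    Pooling the squared deviations over all retrievals weights each
    category by its number of observations.
    """
    squared_sum = 0.0
    count = 0
    for values in groups.values():
        category_mean = sum(values) / len(values)
        squared_sum += sum((v - category_mean) ** 2 for v in values)
        count += len(values)
    return sqrt(squared_sum / count)


# Hypothetical p_D values (hPa) for four high quality retrievals.
single = {"all": [600.0, 620.0, 780.0, 800.0]}
split = {"moist": [600.0, 620.0], "dry": [780.0, 800.0]}

print(pooled_rmse(single))  # one category: large spread about the grand mean
print(pooled_rmse(split))   # two categories: much smaller within-category spread
```

A categorization that separates retrievals with different p_D into different categories lowers the pooled RMSE, which is the behavior examined across categorizations in Fig. 9.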

Figure 9 shows the total p_D RMSE for the thirteen different
categorizations. Precipitable water plays a more important role in
separating p_D than in separating retrieval quality. The PW
categorization, for example, contributes to greater p_D separation
than the C categorization, despite having fewer categories. There is
a further decrease for the CPW categorization, and also for the CTPW
categorization. The LOτTPW_F categorization appears to strike a
balance between minimizing the RMSE and using relatively few
categories, with further, slight decreases for the CTPW and full
LOCTPW categorizations.

From Figs. 8 and 9, clouds, precipitable water and surface
temperature are all important, which we would expect from Table 1.
The cloud categories are important on their own in separating high
from poor quality TES retrievals, and precipitable water provides
most of the separation of p_D. There are diminishing returns,
however, as the size of the categorization increases. It is not
immediately clear whether more complicated categorizations yield
relationships closer to those in Table 1 or different δD fields
after applying the TES operator.

5.3 Categorical controls on TES HDO retrieval quality
and p_D

The categorical operator was tested in ModelE with four
representative categorizations: C, PW, LOτTPW_F and LOCTPW. In each
case, the underlying model configuration was the same as in the case
of applying the retrieval-based TES operator, but the quality
filtering and averaging kernel and H2O prior selection from
individual TES measurements were replaced with categorical
selection.

Fig. 6. HDO averaging kernel rows (grey) at 618 hPa for CTP less
than 440 hPa and for τ (a) less than 0.3, (b) between 0.3 and 1.3,
(c) between 1.3 and 3.6. Black profiles show the grand mean HDO
averaging kernel row at 618 hPa across all high quality retrievals.
Error bars show the standard deviation at each level. These
averaging kernels correspond to the first three entries in the top
row of Table 4.

Fig. 7. HDO averaging kernel rows (grey) at 618 hPa for CTP less
than 440 hPa, τ less than 0.3, PW_B greater than 20 mm, and PW_F:
(a) greater than 20 mm (15 % frequency, 81 % quality), (b) between
10 mm and 20 mm (14 % frequency, 79 % quality), (c) less than 10 mm
(3 % frequency, 82 % quality). Black profiles show the grand mean
HDO averaging kernel row at 618 hPa across all high quality
retrievals. Error bars show the standard deviation at each level.

Figure 10 shows the approximated retrieval quality for the four
categorizations. For the C categorization (Fig. 10a), the
approximated retrieval quality bears some resemblance to the
observed retrieval quality (Fig. 3a), but is 10 % lower over the
ocean and lacks the sharp decrease in retrieval quality over the
southern Sahara. Over the Pacific and Atlantic sectors, the regions
of high retrieval quality are to the east of those in the
observations. The PW categorization (Fig. 10b) results in a mean
ocean retrieval quality of 68.9 %, nearly identical to the TES
observations, but lacks the distinction between wet and dry regions
seen in the observations and for the C categorization. The
approximated retrieval qualities of the LOτTPW_F and LOCTPW
categorizations (Fig. 10c, d) are similar over the ocean, with the
LOCTPW categorization having a sharper decrease over the southern
Sahara.

While it is instructive to see the sensitivity of the retrieval
quality to the different categorizations, their performance should,
strictly speaking, be evaluated according to how well they
approximate the observed relationships in Fig. 4, rather than by
their agreement with the observations in Fig. 3a. These
relationships are shown for the four categorizations in Fig. 11. The
C categorization (Fig. 11a) results in a slightly stronger
relationship (r = −0.78) between the cloud fraction and the
approximated retrieval quality than in the observations. This would
be expected given that clouds are the only categorical variable used
to select quality; in the absence of other, real, complicating
factors, the approximated relationship is slightly too strong
compared to the observed relationship in Fig. 4a. Furthermore, over
the ocean, the lower approximated retrieval quality of 58.9 % is the
result of the higher modeled CF (47.8 %) compared to the TES
observations (35.3 %).

Fig. 8. Mean difference between HDO retrieval quality within each
category and the overall quality for the Single categorization
(68 %), for twelve different categorizations. Differences are
weighted by frequency of occurrence within each category. Numbers in
parentheses indicate the total number of categories in each
categorization.

Conversely, the PW categorization results in a weaker relationship
between CF and retrieval quality (Fig. 11b). In this case, cloud
fraction acts as a lurking variable in the categorization. CF is
somewhat correlated with PW_B (0.48) and PW_F (0.67), but not
strongly enough to accurately predict retrieval quality when
excluded from the categorization. This case reinforces the need to
evaluate the categorical operator based on agreement in the
relationships, rather than in the retrieval quality fields. Over the
ocean, it is tempting to infer that the PW categorization is more
accurate because of its agreement in the mean (Fig. 10b) with
observed retrieval quality. This agreement is misleading, however;
by not including clouds explicitly in the categorization, the
approximated retrieval quality does not decrease under the higher
modeled cloud fraction, which it should. The relationships are in
better agreement, neither too strong nor too weak, for the LOτTPW_F
categorization (Fig. 11c), and to some extent the LOCTPW
categorization (Fig. 11d). Over land, the C and PW categorizations
(Fig. 11e, f) performed poorly in capturing the variation in
retrieval quality. When T_S is not included in the categorization,
there is too little covariation between T_S and any of CF, PW_B or
PW_F to capture the decrease in retrieval quality with T_S. More
realistic approximations were obtained for the LOτTPW_F and LOCTPW
categorizations (Fig. 11g, h), which include T_S and land/ocean
separation, although there is still less agreement than over the
ocean.
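
The evaluation in this section rests on comparing the slope and correlation of a relationship, not the mean of a field. A minimal sketch of that comparison is given below, using an ordinary least-squares slope and the Pearson correlation. The cloud fraction and retrieval quality values are hypothetical, and the function name is illustrative.

```python
from math import sqrt


def slope_and_correlation(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical regional means of cloud fraction (%) and retrieval quality (%).
cloud_fraction = [10.0, 20.0, 30.0, 40.0, 50.0]
quality = [85.0, 80.0, 70.0, 68.0, 60.0]

slope, r = slope_and_correlation(cloud_fraction, quality)
print(slope, r)  # negative slope and strongly negative correlation
```

Two operators can give the same mean retrieval quality while differing in slope or correlation, which is why the PW categorization can match the observed ocean mean and still misrepresent the dependence on cloud fraction.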

The approximated p_D for the four categorizations is shown in
Fig. 12. The approximated p_D for the C categorization (Fig. 12a)
shows little of the variation seen in the TES observations
(Fig. 3b), with little increase in p_D over the Pacific and Atlantic
subtropical anticyclones. The PW categorization (Fig. 12b) does
capture this increase, but not the lower p_D over the tropical rain
belts, and has a smoother structure owing to the smoothness of the
quality filtering. The approximated p_D fields for the LOτTPW_F and
LOCTPW categorizations (Fig. 12c, d) were comparably similar to the
TES p_D fields over the ocean and land.

Fig. 9. RMSE of p_D (height of peak HDO sensitivity) for the twelve
categorizations, and the “Single” categorization. Numbers in
parentheses indicate the total number of categories within each
categorization.

Figure 13 shows the approximated controls on p_D. As in the observed
relationships in Fig. 4b and Fig. 4d, PW_F and T_S include only
model points classified as having high retrieval quality. The weak
slope of the C categorization over the ocean (Fig. 13a) reflects the
absence of variation in p_D in Fig. 12a. The slope for the PW
categorization (Fig. 13b) is closer to the observed slope, but with
an overly strong correlation, too little scatter, and unrealistically
high p_D overall. As with retrieval quality, the control on p_D is
more realistic when both clouds and precipitable water are included
(Fig. 13c, d). The inclusion of clouds in the categorization helps
to separate high PW_F for clear and cloudy sky, allowing the clear
sky values with higher quality to be included. The full LOCTPW
categorization has a more realistic amount of scatter, but both it
and the LOτTPW_F categorization have a steeper slope and higher
correlation than in the observations. The retrieval-based operator
in Fig. 5b, by contrast, had a too-flat slope and weak correlation.
Over land, the approximated T_S control on p_D was of the opposite
sign for the C and PW categorizations (Fig. 13e, f), and was best
approximated by the full LOCTPW categorization (Fig. 13h).

Overall, the LOτTPW_F and LOCTPW categorizations performed best in
approximating controls on retrieval quality and p_D. Both were
equally deficient in having too weak a decrease in retrieval quality
with T_S over land, and an overly strong increase in p_D with PW_F
over the ocean. These are likely the greatest sources of selection
error in applying the categorical TES operator to raw model δD
fields.

6 TES operator effects on δD fields

6.1 Comparison of retrieval-based and categorical TES 
operators 

[Fig. 10, panel (d): retrieval quality (%), ModelE_LOCTPW; ocean
mean 62.3, land mean 53.8]

Fig. 10. Approximated retrieval quality for four representative
categorizations.

Fig. 11. Approximated retrieval quality as a function of CF over the
ocean (top) and T_S over land (bottom) for four representative
categorizations. Grey dashed lines show the 95 % prediction
intervals. Black dashed lines show the corresponding linear fits
from TES observations in Fig. 4a and c.

Ultimately, we are interested in the effects of applying the
different TES operators to raw ModelE δD fields. Figure 14 shows
this effect for the retrieval-based TES operator over the whole
analysis period. Again, the retrieval-based operator has been
applied regardless of agreement between the retrieved and modeled
values of CF, PW_F and T_S. The effect of sampling along the orbital
path can be seen in the less smooth field of Fig. 14b compared to
Fig. 14a. Application of Eq. (1) to the raw model fields after
quality filtering results in an average δD increase of 8.8 ‰ over
ocean and 6.4 ‰ over land (Fig. 14c), but this masks larger regional
changes. In general, the largest absolute changes occur where there
is the largest difference between the raw model field and the prior
δD over 825 hPa to 510 hPa, which is roughly −150 ‰ when vertically
weighted by specific humidity. Over northern Africa, the high model
δD decreases toward the prior by up to 40 ‰, whereas over South
America and the Maritime Continent the low δD increases toward the
prior by up to 35 ‰.

Fig. 12. Approximated p_D (height of peak HDO sensitivity) for four
representative categorizations.

Fig. 13. Approximated p_D (height of peak HDO sensitivity) as a
function of PW_F over the ocean (top) and T_S over land (bottom) for
four representative categorizations. Grey dashed lines show the 95 %
prediction intervals. Black dashed lines show the corresponding
linear fits from TES observations in Fig. 4b and d.

Fig. 14. (a) Raw ModelE vapor δD between 825 and 511 hPa during 2006
and 2009 for all months, (b) ModelE vapor δD after application of
the retrieval-based operator, (c) difference between (b) and (a).
The vertical mean of δD is weighted by specific humidity and
pressure.

[Fig. 15 panel titles: (a) δD (‰): ModelE_C − ModelE, ocean mean
7.0, land mean 3.3; (d) δD (‰): ModelE_LOCTPW − ModelE, ocean mean
9.1, land mean 4.8]

Fig. 15. Same as Fig. 14c, but for the categorical TES operator with
the four representative categorizations.

Figure 15 shows the result of applying the different categorical TES
operators. The changes in δD are similar to those for the
retrieval-based operator in that regions of low raw ModelE δD tend
to increase toward the TES prior, but there are significant regional
differences for the C and PW categorizations. Using the C
categorization (Fig. 15a), there is a strong decrease in δD over the
anticyclones in the Pacific and Atlantic, despite the raw ModelE δD
not being particularly high. This is due to not including PW in the
categorization and consequently not capturing the variation in p_D.
Using only clouds in the categorization, these regions are simply
classified as having low CF, and will be associated with averaging
kernel shapes similar to those in Fig. 6a. This averaging kernel is
inappropriate, however, as it does not capture the higher p_D
associated with the PW_F less than 10 mm (Fig. 7c) that occurs in
those regions. As a result, the mid-tropospheric δD composition,
which is low, has an overly strong influence in applying Eq. (1),
resulting in an overly strong δD decrease. Using the PW
categorization (Fig. 15b), this problem is absent, but there is a
weaker increase in δD over the western Pacific warm pool. The more
complex categorizations result in similar changes to the δD field
(Fig. 15c, d), not varying by more than 1 ‰ in their overall mean
and with only small regional differences. With a sufficient CF
control on retrieval quality and PW_F control on p_D, the
deficiencies over the ocean for the C and PW categorizations are
absent for each.

The ModelE δD changes for the categorical operators result from
approximating the controls on retrieval quality and p_D using
conditions in the model, rather than from collocated TES retrievals.
They are accurate to the extent that the approximated controls in
Fig. 11 and Fig. 13 agree with the observational controls in Fig. 4.
Focusing on the full LOCTPW categorization, the most significant
deficiency was the PW_F control on p_D over the ocean (Fig. 13d),
where the approximated slope was −1.6 hPa mm^-1 too strong compared
to observations. We can see, however, that while the slope for the
LOτTPW_F categorization was only −1.2 hPa mm^-1 too strong, this
translated into less than a 1 ‰ difference in the mean change in δD
over the ocean from the LOCTPW categorization (Fig. 15c, d). This
suggests that if a categorization existed that more closely
approximated the observed PW_F control on p_D, this would not likely
result in a change of more than several ‰ to the transformed δD
field, ignoring the contributions of other secondary controls. This
provides a sense of the maximum error in the transformed δD field
associated with errors in quality filtering and averaging kernel
selection. We note also that in this case, the changes in δD for the
retrieval-based and LOCTPW categorical operators were very similar,
owing to the agreement in the underlying PW_F fields, and because of
the shared HDO prior and raw model δD fields.

[Fig. 16 panel title: (a) δD (‰): ModelE_LOCTPW_NoOrbit − ModelE,
ocean mean 9.2, land mean 6.2]

Fig. 16. Same as Fig. 15, but for two LOCTPW sensitivity tests:
(a) full daytime sampling and not just along the orbital path,
(b) a fixed H2O prior.

6.2 Sensitivity tests 

To further understand how the change in δD might vary with different
configurations, we examined the sensitivity of the LOCTPW-based
operator to the effects of orbital sampling and a fixed H2O prior,
and also its performance outside of the tropics.

The effect of sampling the model at all points and not just along
the TES orbital path was primarily a smoother transformed field
(Fig. 16a) compared to sampling along the orbital path only
(Fig. 15e), owing to a much greater sampling frequency. Aghedo et
al. (2011) found that the effects of orbital path sampling were also
minimal on modeled CO, O3, temperature and H2O at a monthly scale.
Voulgarakis et al. (2011) reached a similar conclusion regarding the
correlation between daily O3 and CO. The TES sampling frequency is
therefore sufficient to capture variability in the model over
several years, although it remains to be seen whether this is the
case at shorter time scales.

Unique to the joint TES HDO/H2O retrievals is the use of a changing
H2O prior. It must also be chosen in applying the TES operator,
representing another potential source of categorical selection
error. We assume that the quality of averaging kernel selection for
the A_HH, A_DH and A_HD operators for different categorizations
follows that of A_DD. As a test of the importance of H2O prior
selection on the TES operator in Eq. (1), we fixed the prior to the
constant profile of the “Single” categorization, but with the
averaging kernels still chosen from the LOCTPW categorization. This
had little effect (Fig. 16b), which likely means that the A_HH and
A_DH terms are typically very similar (as was the case for the
example profile in Fig. 2), and that the strength of the TES
operator is largely controlled by the second term on the RHS of
Eq. (1).


The focus of future comparisons between the modeled and observed δD
fields will be over the tropics, following a series of recent
studies (Lee et al., 2011; Kurita et al., 2011; Berkelhammer et al.,
2012; Kim et al., 2012). For broader potential application, however,
we tested the performance of the TES HDO simulator outside of the
tropical domain. The LOCTPW categorization was re-calculated from
TES measurements over 60° S to 60° N. The range of the surface
temperature categories was widened to span 260 K to 330 K to capture
a wider observed temperature range. Model simulations were run with
the TES operators applied over 60° S to 60° N. To assess performance
outside of the tropics, we examine the degree to which observed
variation in relationship strength by latitude is captured by the
categorical TES operator.

Figure 17 shows the correlation of retrieval quality and p_D with
the primary control variables at different latitudes. Observed
retrieval quality over the oceans (Fig. 17a) remains negatively
correlated with CF, weakening slightly at high northern latitudes.
The retrieval-based operator performs poorly in capturing this
association, but the categorical operator performs well. Over land
(Fig. 17c), the observed negative correlation between retrieval
quality and T_S becomes positive at high latitudes, presumably due
to the poleward covariation between T_S and atmospheric moisture
content. This change is captured by both operators, but too sharply
in the case of the categorical operator.

The associations between p_D and the primary control variables are
not generally well captured over the wider latitude range. Over the
ocean (Fig. 17b), the overly strong negative correlation between p_D
and PW_F over the tropics compared to observations (in Table 1)
increases moving poleward. The observed decrease in correlation
outside of the tropics is captured to some degree by the categorical
operator, but with a lag, and there is no modeled rebound in
correlation at high latitudes. Over land, there is an observed
positive relationship between T_S and p_D across all latitudes
(Fig. 17d). This is poorly captured by the categorical operator, for
which there is no correlation between 40° S and 0°. In fact, over
land, when extratropical TES measurements are included in
calculating the categorization, the performance of the operator is
degraded in the tropics. When the categorization is calculated only
from TES measurements between 15° S and 15° N, the correlation
between p_D and T_S of 0.64 is in good agreement with the observed
correlation of 0.51. When the operator is based on measurements
between 60° S and 60° N, however, the correlation over 15° S to
15° N is 0. So not only is prediction of p_D in the extratropics
poor, but it contaminates the fairly good performance over tropical
land shown in Fig. 13h. Application of the categorical TES operator
outside of the tropics will likely require that latitude-specific
categorizations be computed from the TES retrievals, and possibly
that other control variables be considered.

Fig. 17. Correlation of retrieval quality and p_D (height of peak
HDO sensitivity) with primary predictor variables over different
latitudes.

6.3 Comparison with TES δD

Comparisons between the TES and ModelE δD are shown in Fig. 18. The
raw ModelE δD is on average 17 ‰ lower than TES over the ocean and
41 ‰ lower than TES over land, but with negative biases of up to
63 ‰ and 96 ‰ over each, respectively (Fig. 18b). The negative bias
over the ocean occurs over the tropical rain bands and in the dry
regions off the west South American and central African coasts. In
the latter cases, the bias likely results from outflow of strongly
depleted vapor due to continental convection.

The negative bias over the ocean is reduced to ~7 ‰ after
applying either the retrieval-based (Fig. 18c) or categorical
(Fig. 18d) TES operators, and more weakly reduced to
~35 ‰ over land. The changes in bias over the ocean are
interpreted as follows. Where there is heavy, precipitating
cloud, observed retrieval quality is lower (Fig. 3a). Because
precipitation tends to lower vapor δD (e.g. Lee and Fung,
2008), this introduces an observational bias toward higher δD
through the exclusion of retrievals under cloudy, lower-δD
conditions, and through relaxation toward a prior constraint
with higher δD. By applying the TES operator, these effects are
captured (Figs. 14c, 15e), leading to the more accurate
comparisons in Fig. 18c, d. It also becomes more apparent that
the model bias toward lower δD is specific to a model process
over land. It was beyond the scope of this paper to understand
these biases, but immediate candidates that will be
investigated in the future are too-strong continental
convection and too-weak transpiration.
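
The two effects described above, exclusion of poor-quality retrievals and relaxation toward the prior, can be illustrated with a short sketch. It is a deliberately simplified stand-in for the TES operator: it applies a single averaging kernel to the logarithm of the isotope ratio relative to the standard (1 + δD/1000), whereas the actual HDO/H2O retrieval is a joint one. All function names and numbers are hypothetical.

```python
import math


def apply_operator(model_dd, prior_dd, kernel):
    """Smooth a model dD profile toward the prior with an averaging kernel.

    Simplifying assumption: the kernel acts on ln(1 + dD/1000), i.e.
    ln(r_hat) = ln(r_prior) + A (ln(r_model) - ln(r_prior)).
    model_dd, prior_dd : lists of dD (per mil) on the same levels.
    kernel             : square list-of-lists averaging kernel A.
    """
    n = len(model_dd)
    ln_model = [math.log(1.0 + d / 1000.0) for d in model_dd]
    ln_prior = [math.log(1.0 + d / 1000.0) for d in prior_dd]
    out = []
    for i in range(n):
        ln_hat = ln_prior[i] + sum(
            kernel[i][j] * (ln_model[j] - ln_prior[j]) for j in range(n)
        )
        out.append((math.exp(ln_hat) - 1.0) * 1000.0)
    return out


def mean_of_good(values, good_flags):
    """Average only the values whose retrieval quality flag is True."""
    kept = [v for v, ok in zip(values, good_flags) if ok]
    return sum(kept) / len(kept) if kept else float("nan")
```

With a model value of -200 ‰, a prior of -150 ‰ and a kernel value of 0.5, the smoothed value lies between the two, so low sensitivity pulls depleted model vapor toward the less depleted prior. Likewise, averaging a cloudy, depleted profile that is flagged as poor together with a clear, enriched one that is flagged as good returns only the enriched value.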

7 Discussion 

Changes to the raw model δD over the tropics from applying
the TES operators were large. Over the ocean, the mean
increase in modeled δD from applying the TES operator was
9 ‰, and was up to 30 ‰ over regions with low raw δD such
as the west Pacific warm pool. Over land, there was a mean
increase of 6 ‰, but with increases of up to 30 ‰, and
decreases of up to 40 ‰ over northeastern Africa where raw
δD is very high.

To put these changes in context, they are of the same order
as the δD model biases in previous comparisons against the
TES δD retrievals. Yoshimura et al. (2011) saw a systematic
bias of −20 ‰ in the IsoGSM model over the same vertical
layer. Risi et al. (2012) saw a bias of 30 ‰ in their
comparison of LMDz at 619 hPa. That the regional differences
to the raw ModelE δD fields resulting from the TES operator are
of the same magnitude confirms its importance in any
quantitative comparison between the model and satellite
measurements. Similarly, Aghedo et al. (2011) determined that the
error associated with not applying the TES retrieval operator
to retrieved CO, O3, temperature, and particularly H2O, was
much larger than the error associated with monthly averaging
or the absence of orbital sampling.

The changes in δD for the cloud-only categorization were
unrealistic owing to poor pD approximation. For this nudged
simulation, the new δD fields for the retrieval-based and full
LOCTPW categorical operators were in good agreement
because of the similarity of their PWp and Ts fields and because
of accurate mapping of these quantities to a suitable
averaging kernel. The LOrPWp categorization generally performed
well through its inclusion of the most important controls on
retrieval quality and pD, and has the advantage of having far
fewer categories, but the influence of Ts on pD over land
was too strong. The accuracy of the modeled PWp field is
likely the result of the nudged, large-scale control on the
humidity field and of averaging over four years. It is doubtful
that this agreement will hold for free-running simulations
with strongly perturbed physics or over shorter time scales,
in which case the categorical operator would be more
appropriate.

Fig. 18. (a) Retrieved TES δD (‰). Difference between ModelE and TES for: (b) raw model δD, (c) model δD after applying the retrieval-based operator, (d) model δD after applying the LOCTPW categorical operator (panel title: ocean mean −7.4 ‰, land mean −35.8 ‰).


Particularly for retrieval quality, the categorical operator
performed poorly over land compared to the ocean. One
factor is simply that categorical estimates of retrieval quality
and averaging kernel structure over land will be less robust
because there are fewer TES measurements. More importantly,
there are likely additional factors influencing retrieval
quality and averaging kernel structure over land that we have not
considered. For pD in particular, the observational controls
over land were weaker (Table 1), making their approximation
in the simulator more difficult. As the categorical
operator evolves, we will start by testing topography, land cover
type, and, related to both, thermal contrast between ground
and air, which will be greater over land than ocean. In the
latter case, the apparently worse performance over land could
be because we considered daytime retrievals only.

Further refinements will be required to use the categorical
operator outside of the tropics. Over the oceans, more PWb
and PWp categories will be required at the low ends of their
scales, assuming that vertical moisture gradients continue
to be the dominant control on pD outside of the tropics.
Any improvements that are obtained over land in the tropics
should improve performance in the extratropics, particularly
in the northern hemisphere. We hope to avoid computing the
categorizations separately for different latitude bands, but
this might be inevitable.
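
The category refinement discussed above can be pictured with a small sketch of a categorical lookup. The bin edges, units, table layout, minimum-count threshold and function names are all hypothetical choices made for illustration, not the categories used in ModelE.

```python
from bisect import bisect_right

# Hypothetical bin edges; the actual category boundaries are not given here.
CF_EDGES = [0.25, 0.5, 0.75]   # cloud fraction
TS_EDGES = [290.0, 300.0]      # surface temperature (K)
PW_EDGES = [5.0, 10.0, 20.0]   # free-tropospheric precipitable water (mm)


def category(is_land, cloud_fraction, ts, pwp, pw_edges=PW_EDGES):
    """Map model conditions to a category key (surface, CF bin, Ts bin, PW bin)."""
    return (
        bool(is_land),
        bisect_right(CF_EDGES, cloud_fraction),
        bisect_right(TS_EDGES, ts),
        bisect_right(pw_edges, pwp),
    )


def lookup(table, key, min_count=30):
    """Return the category entry, or None if it is missing or poorly populated.

    table maps a category key to a dict with at least a "count" field, e.g.
    {"count": 120, "good_fraction": 0.6, "kernel": [[...]]} (assumed layout).
    """
    entry = table.get(key)
    if entry is None or entry["count"] < min_count:
        return None
    return entry
```

Passing a finer `pw_edges` list, for example `[1.0, 2.5, 5.0, 10.0, 20.0]`, splits the driest bin into several categories. This is one way to add resolution at the low end of the moisture scale without changing the other dimensions.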

Isotopic constraints provide a new way of assessing GCM
simulations of processes which are highly sensitive to
perturbed cloud physics, such as those driving the
Madden-Julian Oscillation (MJO). Berkelhammer et al. (2012)
separated the evaporative and convergent moisture
contributions during different phases of the MJO. Kim et
al. (2012) showed how the absence of an MJO in the
default AR5 version of ModelE could be rectified by
increasing the entrainment and reevaporation strength in the
convective parameterization, but at the expense of the mean
state of precipitation. It would be instructive to compare the
isotopic response to these changes with TES HDO retrievals,
given the sensitivity of isotopic composition to these types of
processes (Worden et al., 2007b; Lee et al., 2009; Field et al., 2010).
The categorical TES operator provides a means of doing this
for arbitrary convective configurations.

In comparisons between retrieved and simulated HDO for
other models, regardless of which operator approach is taken,
we suggest looking at the agreement
between retrieved and modeled CF, PWp and Ts. This will
give a sense of how appropriate the retrieval quality filtering
and averaging kernel selection are for the modeled
meteorology, particularly as observational constraints are
weakened in free-running perturbed physics experiments. It
remains to be seen how the categorical approach performs for
free-running model simulations or for other isotopically
equipped AGCMs. The modeled retrieval quality and pD fields
(i.e. in Figs. 10, 12) will change to the extent that the
underlying control fields change, or rather, to the extent that
the covariation between the control variables changes. One
potential weakness is that a new model configuration may have
an increase in the frequency of conditions corresponding to
categories that were not well populated by TES
measurements and for which the retrieval quality and mean
averaging kernels are less robust (although the opposite could
also be true). This type of evaluation could also be extended to
other species, such as O3 and CO, after identifying the strongest
controls on their retrieval quality and averaging kernel
structure, as could the categorical TES operator for use in
non-nudged composition-climate model evaluation. We note that
cloud cover and surface temperature will likely play an
important role for most species, but the importance of
atmospheric moisture content is likely specific to HDO.